16 research outputs found
TransRate: reference-free quality assessment of de novo transcriptome assemblies.
TransRate is a tool for reference-free quality assessment of de novo transcriptome assemblies. Using only the sequenced reads and the assembly as input, we show that multiple common artifacts of de novo transcriptome assembly can be readily detected. These include chimeras, structural errors, incomplete assembly, and base errors. TransRate evaluates these errors to produce a diagnostic quality score for each contig, and these contig scores are integrated to evaluate whole assemblies. Thus, TransRate can be used for de novo assembly filtering and optimization as well as comparison of assemblies generated using different methods from the same input reads. Applying the method to a data set of 155 published de novo transcriptome assemblies, we deconstruct the contribution that assembly method, read length, read quantity, and read quality make to the accuracy of de novo transcriptome assemblies and reveal that variance in the quality of the input data explains 43% of the variance in the quality of published de novo transcriptome assemblies. Because TransRate is reference-free, it is suitable for assessment of assemblies of all types of RNA, including assemblies of long noncoding RNA, rRNA, mRNA, and mixed RNA samples.
Ancestral light and chloroplast regulation form the foundations for C4 gene expression.
C4 photosynthesis acts as a carbon concentrating mechanism that leads to large increases in photosynthetic efficiency. The C4 pathway is found in more than 60 plant lineages [1] but the molecular enablers of this evolution are poorly understood. In particular, it is unclear how non-photosynthetic proteins in the ancestral C3 system have repeatedly become strongly expressed and integrated into photosynthesis gene regulatory networks in C4 leaves. Here, we provide clear evidence that in C3 leaves, genes encoding key enzymes of the C4 pathway are already co-regulated with photosynthesis genes and are controlled by both light and chloroplast-to-nucleus signalling. In C4 leaves this regulation becomes increasingly dependent on the chloroplast. We propose that regulation of C4 cycle genes by light and the chloroplast in the ancestral C3 state has facilitated the repeated evolution of the complex and convergent C4 trait. The work was funded by the European Union 3to4 project and Biotechnology and Biological Sciences Research Council (BBSRC) grant BB/J011754/1. I.G.-M. was supported by the Amgen Foundation. Research on chloroplast signalling by M.J.T. was supported by BBSRC grant (BB/J018139/1). This is the author accepted manuscript. The final version is available from the Nature Publishing Group via http://dx.doi.org/10.1038/nplants.2016.16
Shared characteristics underpinning C4 leaf maturation derived from analysis of multiple C3 and C4 species of Flaveria
Most terrestrial plants use C3 photosynthesis to fix carbon. In multiple plant lineages a modified system known as C4 photosynthesis has evolved. To better understand the molecular patterns associated with induction of C4 photosynthesis, the genus Flaveria that contains C3 and C4 species was used. A base to tip maturation gradient of leaf anatomy was defined, and RNA sequencing was undertaken along this gradient for two C3 and two C4 Flaveria species. Key C4 traits including vein density, mesophyll and bundle sheath cross-sectional area, chloroplast ultrastructure, and abundance of transcripts encoding proteins of C4 photosynthesis were quantified. Candidate genes underlying each of these C4 characteristics were identified. Principal components analysis indicated that leaf maturation and the photosynthetic pathway were responsible for the greatest amount of variation in transcript abundance. Photosynthesis genes were over-represented for a prolonged period in the C4 species. Through comparison with publicly available data sets, we identify a small number of transcriptional regulators that have been up-regulated in diverse C4 species. The analysis identifies similar patterns of expression in independent C4 lineages and so indicates that the complex C4 pathway is associated with parallel as well as convergent evolution.
NRG1 fusions in breast cancer
Funders: Cancer Research UK; Breast Cancer Now; Addenbrookes Charitable Trust; Mark Foundation For Cancer Research (doi: http://dx.doi.org/10.13039/100014599); Addenbrooke's Charitable Trust, Cambridge University Hospitals (GB).
Abstract:
Background: NRG1 gene fusions may be clinically actionable, since cancers carrying the fusion transcripts can be sensitive to tyrosine kinase inhibitors. The NRG1 gene encodes ligands for the HER2(ERBB2)-ERBB3 heterodimeric receptor tyrosine kinase, and the gene fusions are thought to lead to autocrine stimulation of the receptor. The NRG1 fusion expressed in the breast cancer cell line MDA-MB-175 serves as a model example of such fusions, showing the proposed autocrine loop and exceptional drug sensitivity. However, its structure has not been properly characterised, its oncogenic activity has not been fully explained, and there is limited data on such fusions in breast cancer.
Methods: We analysed genomic rearrangements and transcripts of NRG1 in MDA-MB-175 and a panel of 571 breast cancers.
Results: We found that the MDA-MB-175 fusion (originally reported as a DOC4(TENM4)-NRG1 fusion, lacking the cytoplasmic tail of NRG1) is in reality a double fusion, PPP6R3-TENM4-NRG1, producing multiple transcripts, some of which include the cytoplasmic tail. We hypothesise that many NRG1 fusions may be oncogenic not for lacking the cytoplasmic domain but because they do not encode NRG1’s nuclear-localised form. The fusion in MDA-MB-175 is the result of a very complex genomic rearrangement, which we partially characterised, that creates additional expressed gene fusions, RSF1-TENM4, TPCN2-RSF1, and MRPL48-GAB2. We searched for NRG1 rearrangements in 571 breast cancers subjected to genome sequencing and transcriptome sequencing and found four cases (0.7%) with fusions, WRN-NRG1, FAM91A1-NRG1, ARHGEF39-NRG1, and ZNF704-NRG1, all splicing into NRG1 at the same exon as in MDA-MB-175. However, the WRN-NRG1 and ARHGEF39-NRG1 fusions were out of frame. We identified rearrangements of NRG1 in many more (8% of) cases that seemed more likely to inactivate than to create activating fusions, or whose outcome could not be predicted because they were complex, or both. This is not surprising because NRG1 can be pro-apoptotic and is inactivated in some breast cancers.
Conclusions: Our results highlight the complexity of rearrangements of NRG1 in breast cancers and confirm that some do not activate but inactivate. Careful interpretation of NRG1 rearrangements will therefore be necessary for appropriate patient management.
The Pfam protein families database
Peer reviewed
Imaging breast cancer using hyperpolarized carbon-13 MRI.
Our purpose is to investigate the feasibility of imaging tumor metabolism in breast cancer patients using 13C magnetic resonance spectroscopic imaging (MRSI) of hyperpolarized 13C label exchange between injected [1-13C]pyruvate and the endogenous tumor lactate pool. Treatment-naïve breast cancer patients were recruited: four triple-negative grade 3 cancers; two invasive ductal carcinomas that were estrogen and progesterone receptor-positive (ER/PR+) and HER2/neu-negative (HER2-), one grade 2 and one grade 3; and one grade 2 ER/PR+ HER2- invasive lobular carcinoma (ILC). Dynamic 13C MRSI was performed following injection of hyperpolarized [1-13C]pyruvate. Expression of lactate dehydrogenase A (LDHA), which catalyzes 13C label exchange between pyruvate and lactate, of hypoxia-inducible factor-1α (HIF1α), and of the monocarboxylate transporters MCT1 and MCT4 was quantified using immunohistochemistry and RNA sequencing. We have demonstrated the feasibility and safety of hyperpolarized 13C MRI in early breast cancer. Both intertumoral and intratumoral heterogeneity of the hyperpolarized pyruvate and lactate signals were observed. The lactate-to-pyruvate signal ratio (LAC/PYR) ranged from 0.021 to 0.473 across the tumor subtypes (mean ± SD: 0.145 ± 0.164), and a lactate signal was observed in all of the grade 3 tumors. The LAC/PYR was significantly correlated with tumor volume (R = 0.903, P = 0.005), MCT1 expression (R = 0.85, P = 0.032), and HIF1α expression (R = 0.83, P = 0.043). Imaging of hyperpolarized [1-13C]pyruvate metabolism in breast cancer is feasible and demonstrated significant intertumoral and intratumoral metabolic heterogeneity, where lactate labeling correlated with MCT1 expression and hypoxia.
Discovery of the First Insect Nidovirus, a Missing Evolutionary Link in the Emergence of the Largest RNA Virus Genomes
Nidoviruses with large genomes (26.3–31.7 kb; ‘large nidoviruses’), including Coronaviridae and Roniviridae, are the most complex positive-sense single-stranded RNA (ssRNA+) viruses. Based on genome size, they are far separated from all other ssRNA+ viruses (below 19.6 kb), including the distantly related Arteriviridae (12.7–15.7 kb; ‘small nidoviruses’). Exceptionally for ssRNA+ viruses, large nidoviruses encode a 3′-5′ exoribonuclease (ExoN) that was implicated in controlling RNA replication fidelity. Its acquisition may have given rise to the ancestor of large nidoviruses, a hypothesis for which we here provide evolutionary support using comparative genomics involving the newly discovered first insect-borne nidovirus. This Nam Dinh virus (NDiV), named after a Vietnamese province, was isolated from mosquitoes and is yet to be linked to any pathology. The genome of this enveloped 60–80 nm virus is 20,192 nt and has a nidovirus-like polycistronic organization including two large, partially overlapping open reading frames (ORF) 1a and 1b followed by several smaller 3′-proximal ORFs. Peptide sequencing assigned three virion proteins to ORFs 2a, 2b, and 3, which are expressed from two 3′-coterminal subgenomic RNAs. The NDiV ORF1a/ORF1b frameshifting signal and various replicative proteins were tentatively mapped to canonical positions in the nidovirus genome. They include six nidovirus-wide conserved replicase domains, as well as the ExoN and 2′-O-methyltransferase that are specific to large nidoviruses. NDiV ORF1b also encodes a putative N7-methyltransferase, identified in a subset of large nidoviruses, but not the uridylate-specific endonuclease that, in deviation from the current paradigm, is present exclusively in the currently known vertebrate nidoviruses. Rooted phylogenetic inference by Bayesian and Maximum Likelihood methods indicates that NDiV clusters with roniviruses and that its branch diverged from large nidoviruses early after they split from small nidoviruses. Together these characteristics identify NDiV as the prototype of a new nidovirus family and a missing link in the transition from small to large nidoviruses.
Transcript residency on ribosomes reveals a key role for the Arabidopsis thaliana bundle sheath in sulfur and glucosinolate metabolism
Leaves of angiosperms are made up of multiple distinct cell types. While the functions of mesophyll cells, guard cells, phloem companion cells and sieve elements are clearly described, this is not the case for the bundle sheath (BS). To provide insight into the role of the BS in the C3 species Arabidopsis thaliana, we labelled ribosomes in this cell type with a FLAG tag. We then used immunocapture to isolate these ribosomes, followed by sequencing of resident mRNAs. This showed that 5% of genes showed specific splice forms in the BS, and that 15% of genes were preferentially expressed in these cells. The BS translatome strongly implies that the BS plays specific roles in sulfur transport and metabolism, glucosinolate biosynthesis and trehalose metabolism. Much of the C4 cycle is differentially expressed between the C3 BS and the rest of the leaf. Furthermore, the global patterns of transcript residency on BS ribosomes overlap to a greater extent with cells of the root pericycle than any other cell type. This analysis provides the first insight into the molecular function of this cell type in C3 species, and also identifies characteristics of BS cells that are probably ancestral to both C3 and C4 plants.